Stochastic Streams: Sample Complexity vs. Space Complexity

Authors

  • Michael S. Crouch
  • Andrew McGregor
  • Gregory Valiant
  • David P. Woodruff
Abstract

We address the trade-off between the computational resources needed to process a large data set and the number of samples available from the data set. Specifically, we consider the following abstraction: we receive a potentially infinite stream of IID samples from some unknown distribution D, and are tasked with computing some function f(D). If the stream is observed for time t, how much memory, s, is required to estimate f(D)? We refer to t as the sample complexity and s as the space complexity. The main focus of this paper is investigating the trade-offs between the space and sample complexity. We study these trade-offs for several canonical problems studied in the data stream model: estimating the collision probability, i.e., the second moment of a distribution, deciding if a graph is connected, and approximating the dimension of an unknown subspace. Our results are based on techniques for simulating different classical sampling procedures in this model, emulating random walks given a sequence of IID samples, as well as leveraging a characterization between communication-bounded protocols and statistical query algorithms.

1998 ACM Subject Classification: F.2.3 Tradeoffs between Complexity Measures
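To make the space versus sample trade-off concrete, here is a minimal sketch of one extreme point for the collision-probability problem. It is not taken from the paper; it only uses the standard fact that two independent draws X, Y from D satisfy P[X = Y] = sum_i p_i^2. The function names `estimate_collision_probability` and `iid_stream`, and the parameter `num_pairs`, are hypothetical and chosen for illustration.

```python
import random
from typing import Iterable, Iterator, Sequence


def estimate_collision_probability(stream: Iterable[int], num_pairs: int) -> float:
    """Estimate sum_i p_i^2 from IID samples.

    Illustrative only (not the paper's algorithm): consume the stream in
    disjoint consecutive pairs and return the fraction of pairs whose two
    samples are equal. Only one sample and two counters are kept in memory,
    at the cost of reading 2 * num_pairs samples.
    Raises StopIteration if the stream ends early.
    """
    it = iter(stream)
    collisions = 0
    for _ in range(num_pairs):
        first = next(it)
        second = next(it)
        if first == second:
            collisions += 1
    return collisions / num_pairs


def iid_stream(weights: Sequence[float], rng: random.Random) -> Iterator[int]:
    """Hypothetical helper: endless IID samples from {0, ..., len(weights)-1}."""
    support = range(len(weights))
    while True:
        yield rng.choices(support, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Uniform distribution over 4 items has collision probability 1/4.
    print(estimate_collision_probability(iid_stream([1, 1, 1, 1], rng), 20000))
```

Each pair is an independent unbiased indicator of a collision, so the estimate concentrates around the true value as `num_pairs` grows. Storing more samples at once would let every new sample be compared against many earlier ones, which is the kind of extra space for fewer samples exchange the abstract describes.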


Similar resources

Anomaly Detection in Hierarchical Data Streams under Unknown Models

We consider the problem of detecting a few targets among a large number of hierarchical data streams. The data streams are modeled as random processes with unknown and potentially heavy-tailed distributions. The objective is an active inference strategy that determines, sequentially, which data stream to collect samples from in order to minimize the sample complexity under a reliability constra...


The Effect of Task Complexity on EFL Learners’ Narrative Writing Task Performance

This study examined the effects of task complexity on written narrative production under different task complexity conditions by EFL learners at different proficiency levels. Task complexity was manipulated along Robinson’s (2001b) proposed task complexity dimension of Here-and-Now (simple) vs. There-and-Then (complex). Accordingly, three specific measures of the written narratives were targ...


Time and Space Complexity Reduction of a Cryptanalysis Algorithm

The Binary Decision Diagram (BDD) is an efficient data structure that has been widely used in computer science and engineering. The BDD-based attack is one of the best forms of attack in key stream cryptanalysis. In this paper, we propose a new key stream attack based on the ZDD (Zero-suppressed BDD). We show how a ZDD-based key stream attack is more efficient in time and ...


The Impact of Pre-task Planning Vs. On-line Planning on Writing Performance: A Test of Accuracy, Fluency, and Complexity

The aim of the current study was to compare the influence of on-line planning and pre-task planning on the performance of EFL university students enjoying different levels of proficiency regarding accuracy, fluency and complexity. To this end a group of 134 EFL learners with different proficiency levels were asked to write narrative tasks under two planning conditions (Pre-task planning and on-...


The Impact of Mediational Artifact Types on EFL Learners’ Writing Complexity: Collaboration vs. Asynchronous Artifacts

The present study was an attempt to investigate the significance of environmental changes on the development of writing in English as a Foreign Language (EFL) context with respect to the individual. This study also compared the impacts of collaboration and asynchronous computer mediation (ACM) on the writing complexity of EFL learners. To this end, three intact writing classes were designate...




Publication date: 2016